
    Model Compression Methods for YOLOv5: A Review

    Over the past few years, extensive research has been devoted to enhancing YOLO object detectors. Since its introduction, eight major versions of YOLO have been released with the purpose of improving its accuracy and efficiency. While the evident merits of YOLO have led to its extensive use in many areas, deploying it on resource-limited devices poses challenges. To address this issue, various neural network compression methods have been developed, falling under three main categories: network pruning, quantization, and knowledge distillation. The fruitful outcomes of model compression, such as lower memory usage and inference time, make these methods favorable, if not necessary, for deploying large neural networks on hardware-constrained edge devices. In this review paper, our focus is on pruning and quantization due to their comparative modularity. We categorize these methods and analyze the practical results of applying them to YOLOv5, thereby identifying gaps in adapting pruning and quantization to YOLOv5 compression and providing future directions for further exploration. Among the several versions of YOLO, we specifically choose YOLOv5 for its excellent trade-off between recency and popularity in the literature. This is the first review paper that specifically surveys pruning and quantization methods for YOLOv5 from an implementation point of view. Our study also extends to newer versions of YOLO, since implementing them on resource-limited devices poses the same challenges that persist today. This paper targets readers interested in the practical deployment of model compression methods on YOLOv5 and in exploring compression techniques applicable to subsequent versions of YOLO.
    Comment: 18 pages, 7 figures
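
    Of the compression families this review covers, pruning and quantization are the most readily scripted. Below is a minimal sketch of both applied to a YOLOv5-style model, assuming PyTorch and the ultralytics/yolov5 torch.hub entry point are available; the 30% sparsity and int8 settings are illustrative, and a real deployment would fine-tune after pruning and prefer static or quantization-aware quantization.

```python
# Minimal sketch: magnitude pruning + dynamic quantization on a YOLOv5-style
# model. The sparsity level and layer choices are illustrative assumptions.
import torch
import torch.nn.utils.prune as prune

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # pretrained detector

# Unstructured L1 pruning: zero out the 30% smallest-magnitude conv weights.
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name='weight', amount=0.3)
        prune.remove(module, 'weight')  # bake the sparsity into the weights

# Post-training dynamic quantization: weights to int8. Only Linear layers are
# listed here, since PyTorch dynamic quantization does not cover Conv2d.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```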

    Out-of-distribution detection using inter-level features of deep neural networks

    The work presented in this thesis addresses the problem of Out-of-Distribution (OOD) detection in deep learning-based classifiers. The emphasis is on real-world settings in which In-Distribution (ID) and OOD samples are hard to distinguish because of their high semantic similarity. Despite their strong classification performance, such networks fail to handle OOD samples correctly. Training relies heavily on the closed-world assumption: models are assumed to receive only samples drawn from a distribution similar to the training data. In dynamic environments this assumption is too strict and leads to severe degradation in performance. While uncertainty quantification methods are elegant ways to represent predictive confidence, the analysis presented here illustrates multiple challenges they face in OOD detection. Motivated by the shortcomings of uncertainty-based detectors, a novel mechanism is proposed to efficiently detect OOD samples. The mechanism leverages features extracted from intermediate layers of a classifier and examines their activations. These activations are distinctive for ID samples and can be used to distinguish dissimilar data, even when samples are assigned the same label. To verify its performance, the proposed method was first combined with an uncertainty-based classifier and tested on OOD samples chosen to resemble the ID samples; it significantly outperforms the uncertainty threshold-based method for OOD detection. Furthermore, the approach was implemented with common classifiers and showed improved performance on several standard OOD datasets.
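
    The abstract's core idea, scoring a sample by how its intermediate-layer activations deviate from in-distribution statistics, can be sketched generically with forward hooks. The layer choice, the per-layer scalar statistic, and the standardized-deviation score below are assumptions for illustration, not the thesis's exact mechanism.

```python
# Generic sketch of inter-level feature OOD scoring: collect activations from
# chosen intermediate layers via forward hooks, then compare them to mean/std
# statistics fitted on in-distribution data. All specifics are assumptions.
import torch

def register_feature_hooks(model, layer_names):
    feats = {}
    def make_hook(name):
        def hook(module, inputs, output):
            feats[name] = output.flatten(1).mean(dim=1)  # one scalar per sample
        return hook
    for name, module in model.named_modules():
        if name in layer_names:
            module.register_forward_hook(make_hook(name))
    return feats

@torch.no_grad()
def ood_score(model, feats, x, id_mean, id_std):
    model(x)  # forward pass fills `feats` via the hooks
    # Sum of standardized deviations across monitored layers; large => likely OOD.
    return sum((feats[n] - id_mean[n]).abs() / id_std[n] for n in id_mean)
```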

    DOPESLAM: High-Precision ROS-Based Semantic 3D SLAM in a Dynamic Environment

    Recent advancements in deep learning techniques have accelerated the growth of robotic vision systems. One application of this technology is a mobile robot that automatically generates a 3D map and identifies the objects within it. This paper addresses the challenge of labeling objects and generating 3D maps in a dynamic environment. It explores a solution that combines Deep Object Pose Estimation (DOPE) with Real-Time Appearance-Based Mapping (RTAB-Map) by means of loosely coupled parallel fusion. DOPE's capabilities are enhanced by leveraging its belief map system to filter uncertain keypoints, increasing precision so that only the most reliable object labels end up on the map. Additionally, DOPE's pipeline is modified to enable shape-based object recognition from depth maps, allowing it to identify objects in complete darkness. Three experiments are performed to find the ideal training dataset, quantify the increased precision, and evaluate the overall performance of the system. The results show that the proposed solution outperforms existing methods in most intended scenarios, such as unilluminated scenes. The proposed keypoint filtering technique improves average inference speed, achieving a 2.6× speedup, and reduces the average distance to the ground truth compared to the original DOPE algorithm.
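
    The belief-map filtering step described above can be illustrated with a short sketch: keep a keypoint only if the peak of its belief map clears a confidence threshold. The (K, H, W) map layout and the 0.5 threshold are assumptions for illustration, not DOPE's actual interface.

```python
# Sketch of confidence-based keypoint filtering on DOPE-style belief maps:
# a keypoint survives only if its heatmap peak exceeds a threshold, so that
# uncertain keypoints never reach the pose solver or the map.
import numpy as np

def filter_keypoints(belief_maps: np.ndarray, threshold: float = 0.5):
    """belief_maps: (K, H, W) array, one heatmap per object keypoint."""
    keypoints = []
    for k, bmap in enumerate(belief_maps):
        peak = bmap.max()
        if peak < threshold:
            continue  # uncertain keypoint: drop it rather than corrupt the pose
        y, x = np.unravel_index(bmap.argmax(), bmap.shape)
        keypoints.append((k, int(x), int(y), float(peak)))
    return keypoints
```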

    Deep Learning Sensor Fusion for Autonomous Vehicle Perception and Localization : A Review

    Autonomous vehicles (AVs) are expected to improve, reshape, and revolutionize the future of ground transportation. It is anticipated that ordinary vehicles will one day be replaced with smart vehicles able to make decisions and perform driving tasks on their own. To achieve this objective, self-driving vehicles are equipped with sensors that sense and perceive both their immediate surroundings and the faraway environment, aided by further advances in communication technologies such as 5G. In the meantime, local perception, as with human beings, will continue to be an effective means of controlling the vehicle at short range. On the other hand, extended perception allows for anticipation of distant events and produces smarter behavior to guide the vehicle to its destination while respecting a set of criteria (safety, energy management, traffic optimization, comfort). Despite remarkable recent advances in the effectiveness and applicability of sensor technologies for AV systems, sensors can still fail because of noise, ambient conditions, or manufacturing defects, among other factors; hence, it is not advisable to rely on a single sensor for any autonomous driving task. The practical solution is to incorporate multiple competitive and complementary sensors that work synergistically to overcome their individual shortcomings. This article provides a comprehensive review of the state-of-the-art methods used to improve the performance of AV systems in short-range, local vehicle environments. Specifically, it focuses on recent studies that use deep learning sensor fusion algorithms for perception, localization, and mapping. The article concludes by highlighting current trends and possible future research directions.
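
    The complementary-sensor argument can be made concrete with the simplest fusion rule: inverse-variance weighting of two noisy measurements of the same quantity, the scalar building block of Kalman-style fusion. The sensor names and variances below are illustrative assumptions, not a method from the reviewed studies.

```python
# Minimal sketch of complementary fusion: combine two noisy estimates of the
# same state by inverse-variance weighting; the fused variance is always
# smaller than either input's, which is why fusion beats any single sensor.

def fuse(z_cam, var_cam, z_lidar, var_lidar):
    """Inverse-variance weighted fusion of two measurements of one state."""
    w_cam, w_lidar = 1.0 / var_cam, 1.0 / var_lidar
    fused = (w_cam * z_cam + w_lidar * z_lidar) / (w_cam + w_lidar)
    fused_var = 1.0 / (w_cam + w_lidar)
    return fused, fused_var

# Example: camera says 10.2 m (var 0.5), lidar says 9.9 m (var 0.1).
pos, var = fuse(10.2, 0.5, 9.9, 0.1)  # lidar dominates: ~9.95 m, var ~0.083
```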